Abstract

Let $S(A)$ denote the orbit of a complex or real matrix $A$ under a certain equivalence relation
such as unitary similarity, unitary equivalence, or unitary congruence. Efficient gradient-flow
algorithms are constructed to determine the best approximation of a given matrix $A_0$ by the sum
of matrices in $S(A_1), \ldots, S(A_N)$ in the sense of finding the Euclidean least-squares distance

$$\min \{ \|X_1 + \cdots + X_N - A_0\| : X_j \in S(A_j),\ j = 1, \ldots, N \}.$$

Connections of the results to different pure and applied areas are discussed. 
1 Introduction

Motivated by problems in pure and applied areas, there has been a great deal of interest in studying 
equivalence classes on matrices, say, under compact Lie group actions. For instance, 

(a) the unitary (orthogonal) similarity orbit of a complex (real) square matrix A is the set of 
matrices of the form UAU* for unitary (or real orthogonal) matrices U, 

(b) the unitary (orthogonal) equivalence orbit of a complex (real) rectangular matrix A is the 
set of matrices of the form UAV for unitary (orthogonal) matrices U, V of appropriate sizes, 

(c) the unitary t-congruence orbit of a complex square matrix A is the set of matrices of the
form $UAU^t$ for unitary matrices U,

(d) the orthogonal similarity orbit of a complex square matrix A is the set of matrices of the
form $QAQ^t$ for complex orthogonal matrices Q, i.e., $Q^tQ = I_n$,

(e) the similarity orbit of a square matrix A is the set of matrices of the form $SAS^{-1}$ for
invertible matrices S.
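The first three orbits can be explored numerically. The following is a minimal sketch, assuming NumPy and a helper `haar_unitary` (a hypothetical name, not from the paper) that draws a random unitary by QR decomposition; it generates one element of each orbit and exposes the quantities that stay invariant along it.

```python
import numpy as np

def haar_unitary(n, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # fix the column phases

def singular_values(M):
    return np.linalg.svd(M, compute_uv=False)

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
U, V = haar_unitary(n, rng), haar_unitary(n, rng)

similar = U @ A @ U.conj().T   # (a) unitary similarity  U A U*
equivalent = U @ A @ V         # (b) unitary equivalence U A V
congruent = U @ A @ U.T        # (c) unitary t-congruence U A U^t
```

All three actions preserve the singular values (and hence the Frobenius norm) of A, while unitary similarity additionally preserves the characteristic polynomial.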
It is often useful to determine whether a matrix $A_0$ can be written as a sum of matrices from
orbits $S(A_1), \ldots, S(A_N)$. Equivalently, one would like to know whether

$$S(A_0) \subseteq S(A_1) + \cdots + S(A_N).$$

For $N = 1$, it reduces to the basic problem of checking whether $A_0$ is equivalent to $A_1$. In some cases,
even this is non-trivial. For instance, it is not easy to check whether two $n \times n$ complex matrices
are unitarily similar. For $N > 1$, the problem is usually more involved. Even if there are theoretical
results, it may not be easy to use them in practice or to check examples of matrices of moderate
size. For instance, given $10 \times 10$ Hermitian matrices A, B, C, to conclude that $C = UAU^* + VBV^*$
for some unitary matrices U and V, one needs to check thousands of inequalities involving the
eigenvalues of A, B, and C; see [12]. Therefore, one purpose of this paper is to set up a general
framework to develop efficient computer algorithms and programs to solve such problems. In fact,
we will treat the more general problem of finding the best approximation of a given matrix $A_0$
by the sum of matrices from matrix orbits $S(A_1), \ldots, S(A_N)$. In other words, for given matrices
$A_0, A_1, \ldots, A_N$, we determine

$$\min \{ \|X_1 + \cdots + X_N - A_0\| : (X_1, \ldots, X_N) \in S(A_1) \times \cdots \times S(A_N) \}.$$

The results will be useful in solving numerical problems efficiently, and helpful in testing conjectures
arising in the theoretical development of the topics under consideration. As we will see in the following
discussion, some numerical examples indeed lead to general theory; see Section 3.

We will consider different matrix orbits in the next few sections. In each case, we will mention
the motivation of the problems and derive the gradient flows for the respective orbits, which will
be used to design the algorithms and computer programs to solve the optimization problem. Note
that we always consider the orbits of similarity $SAS^{-1}$ and equivalence $SAT$, where $S, T$ can be
elements of any semisimple compact connected matrix Lie group, in particular the special unitary
group $SU(n)$ and subgroups thereof. Since these matrix Lie groups are compact, they are themselves
smooth Riemannian manifolds $M$, which in turn implies they are endowed with a Riemannian metric
induced by the non-degenerate Killing form related to a bi-invariant scalar product $\langle \cdot | \cdot \rangle_x$ on their
tangent and cotangent spaces $T_xM$ and $T_x^*M$. The metric smoothly varies with $x \in M$ and allows
for identifying the Fréchet differential in $T_x^*M$ with the gradient in $T_xM$. Moreover, in Riemannian
manifolds the existence and convergence of gradient flows with appropriate discretization schemes
are elaborated in detail in Ref. [30]. In the present context, it is important to note that the
subsequent gradient flows on the unitary congruence orbit and the unitary equivalence orbit are
fundamental. The flows on compact connected subgroups of $SU(n)$ such as $SO(n)$ or $SU(2)^{\otimes m}$
(with $2^m = n$) can readily be derived from the flows on $SU(n)$ [29, 30]. Furthermore, in each case,
we will provide numerical examples to illustrate their efficiency and accuracy.

The situation in the general linear group $GL(N)$ and its subgroups that are not in the
intersection with the unitary groups is entirely different: those groups are no longer compact, but only
locally compact. For $GL(N)$ orbits we give an outlook with some analytical results on infima
of Euclidean distances. Since locally compact Lie groups lack bi-invariant metrics on the tangent
spaces to their orbit manifolds, they can only be endowed with left-invariant or right-invariant
metrics. Moreover, the exponential map onto locally compact Lie groups is no longer geodesic as
in the compact case. Consequently, one will have to devise other approximations to the respective
geodesics than those obtained by the (Riemannian) exponential. These numerics are thus a separate
topic of current research and will therefore be pursued in a follow-up study.

With regard to notation, unless stated otherwise, the norm $\|A\|$ shall always be read as the
Frobenius norm $\|A\|_2 := \sqrt{\mathrm{tr}\,(A^*A)}$.
2 Unitary Similarity Orbits



2.1 The Hermitian Matrix Case 

For an $n \times n$ Hermitian matrix A, let $S(A)$ be the set of matrices unitarily similar to A. Then

$$S(A) + S(B) = \{X + Y : (X, Y) \in S(A) \times S(B)\}$$

is a union of unitary similarity orbits. Researchers have determined the necessary and sufficient
conditions for $S(C)$ to be a subset of $S(A) + S(B)$ in terms of the eigenvalues of A, B and C;
see, e.g., [12]. In particular, suppose A, B, C have eigenvalues

$$a_1 \ge \cdots \ge a_n, \quad b_1 \ge \cdots \ge b_n, \quad \text{and} \quad c_1 \ge \cdots \ge c_n,$$

respectively. Then $S(C) \subseteq S(A) + S(B)$ if and only if

$$\sum_{j=1}^{n} (a_j + b_j - c_j) = 0 \qquad (2.1)$$

and a collection of inequalities in the form

$$\sum_{r \in R} a_r + \sum_{s \in S} b_s \ge \sum_{t \in T} c_t \qquad (2.2)$$

for certain m-element subsets $R, S, T \subset \{1, \ldots, n\}$ with $1 \le m < n$ determined by the
Littlewood-Richardson rules; see [12] for details. The study has connections to many different areas such as
representation theory, algebraic geometry, and algebraic combinatorics. Note that the relation
between Horn's problem and the Littlewood-Richardson rules has recently also attracted attention
in quantum information [8]. The set of inequalities in (2.2) grows exponentially with n. Therefore,
it is not easy to check the conditions even for a moderate size problem, say, for $10 \times 10$ Hermitian
matrices. As a matter of fact, the theory has been extended to determine whether $S(A_0)$ is a subset
of $S(A_1) + \cdots + S(A_N)$ for given $n \times n$ Hermitian matrices $A_0, \ldots, A_N$, in terms of an equality and
linear inequalities of the eigenvalues of the given matrices. Of course, the inequalities involved
are then even more numerous. There does not seem to be an efficient way to use these results in
practice, whether for testing numerical examples or conjectures in research.

It is interesting to note that by the saturation conjecture (theorem) (see [3] and its references),
there exist Hermitian matrices A and B with nonnegative integral eigenvalues $a_1 \ge \cdots \ge a_n$ and
$b_1 \ge \cdots \ge b_n$ such that $A + B$ has nonnegative integral eigenvalues $c_1 \ge \cdots \ge c_n$ if and only if the Young
diagram corresponding to $(c_1, \ldots, c_n)$ can be obtained from those of $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$.



2.2 The General Complex Matrix Case 

Likewise, we study the problem

$$\min \Big\{ \Big\| \sum_{j=1}^{N} U_j A_j U_j^* - A_0 \Big\| : U_1, \ldots, U_N \in SU(n) \text{ unitary} \Big\}$$

for general complex matrices $A_0, \ldots, A_N$. Even for $N = 1$, the result is highly nontrivial. In theory,
it is related to the problem of determining whether $A_0$ and $A_1$ are unitarily similar; see [31]. Also,
to determine

$$\min \{ \|UAU^* - C^*\| : U \text{ unitary} \}$$
for $A, C \in M_n$ leads to the study of the C-numerical range and the C-numerical radius of A defined
by

$$W(C, A) = \{ \mathrm{tr}\,(CUAU^*) : U \in SU(n) \}$$

and

$$r(C, A) = \max \{ |z| : z \in W(C, A) \}.$$

The C-numerical radius is important in the study of unitary similarity invariant norms on $M_n$,
i.e., norms $\nu$ satisfying $\nu(UXU^*) = \nu(X)$ for all $X, U \in M_n$ such that U is unitary. For instance, it is
known that for every unitary similarity invariant norm $\nu$ there is a compact subset $S$ of $M_n$ such
that

$$\nu(X) = \max \{ r(C, X) : C \in S \}.$$

So, the C-numerical radii can be viewed as the building blocks of unitary similarity invariant norms.
We refer readers to the survey [22] for further results on the C-numerical range and C-numerical
radius. For applications of C-numerical ranges in quantum dynamics, see also Ref. [29].
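To get a feeling for these definitions, one can sample points of $W(C, A)$ by drawing random unitaries. The sketch below is an illustration only: the helper names and the choice of A and C are assumptions, and sampling merely yields a lower estimate of $r(C, A)$. Drawing U from $U(n)$ rather than $SU(n)$ makes no difference here, since a global phase cancels in $UAU^*$.

```python
import numpy as np

def haar_unitary(n, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def sample_c_numerical_range(C, A, num=500, seed=0):
    """Return `num` sampled points tr(C U A U*) of W(C, A)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    pts = []
    for _ in range(num):
        U = haar_unitary(n, rng)
        pts.append(np.trace(C @ U @ A @ U.conj().T))
    return np.array(pts)

# Illustrative data: with C = e_1 e_1^*, W(C, A) is the classical numerical range of A.
A = np.diag([1.0, 2.0, 3.0]).astype(complex)
C = np.diag([1.0, 0.0, 0.0]).astype(complex)
pts = sample_c_numerical_range(C, A)
r_lower = np.max(np.abs(pts))   # lower estimate of the C-numerical radius
```

For this Hermitian A the sampled points are real and lie in the interval spanned by its extreme eigenvalues, so `r_lower` approaches 3 from below as more samples are drawn.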

For two matrices, one may study whether C = UAU* + VBV* for, e.g., a Hermitian A and a 
skew-Hermitian B. In other words, we want to study whether a matrix can be written as the sum 
of a Hermitian matrix and a skew-Hermitian matrix with prescribed eigenvalues. 

2.3 Sum of Hermitian and Skew-Hermitian Matrices 

For C = UAU* + VBV* with A = A* and B = —B*, there are many known inequalities relating 
the eigenvalues of A and B to the eigenvalues and singular values of C ; see fS] and the references 
therein. However, there has been no known necessary and sufficient condition for the existence 
of matrices A,B,C satisfying C = UAU* + VBV* with A = A* and B = -B* with prescribed 
eigenvalues or with prescribed singular values. Nevertheless, it is easy to solve the approximation
problem

$$\min \{ \|U^*AU + V^*BV - C\| : U, V \text{ unitary} \}.$$

The following result actually holds for any unitarily invariant norm on $n \times n$ matrices using the
same proof; see [21]. Furthermore, we can use this result to verify that our algorithm indeed yields
the optimal solution; see Example 2 in Section 2.5.

Theorem 2.1 Let $\|\cdot\|$ be the Frobenius norm on $M_n$. Let $A, B, C \in M_n$ with $A = A^*$ and
$B = -B^*$. Suppose $U, V \in M_n$ are unitary matrices such that $U \tfrac{1}{2}(C + C^*) U^* = \mathrm{diag}\,(f_1, \ldots, f_n)$
with $f_1 \ge \cdots \ge f_n$, and $V \tfrac{1}{2}(C - C^*) V^* = i\,\mathrm{diag}\,(g_1, \ldots, g_n)$ with $g_1 \ge \cdots \ge g_n$. Suppose A
is unitarily similar to a diagonal matrix $A_1$ (respectively, $A_2$) with diagonal entries arranged in
descending (respectively, ascending) order. Suppose $-iB$ is unitarily similar to a diagonal matrix
$-iB_1$ (respectively, $-iB_2$) with diagonal entries arranged in descending (respectively, ascending)
order. Write $a_1 \ge \cdots \ge a_n$ for the eigenvalues of A and $b_1 \ge \cdots \ge b_n$ for those of $-iB$. Then

$$\|U^*A_1U + V^*B_1V - C\|^2 = \sum_{j=1}^{n} \big( |f_j - a_j|^2 + |g_j - b_j|^2 \big),$$

$$\|U^*A_2U + V^*B_2V - C\|^2 = \sum_{j=1}^{n} \big( |f_j - a_{n-j+1}|^2 + |g_j - b_{n-j+1}|^2 \big),$$

and for any unitary $X, Y \in M_n$,

$$\|U^*A_1U + V^*B_1V - C\| \le \|X^*AX + Y^*BY - C\| \le \|U^*A_2U + V^*B_2V - C\|.$$
Proof. Let $F = \tfrac{1}{2}(C + C^*)$ and $G = \tfrac{1}{2}(C - C^*)$. It is well known that

$$\|F - U^*A_1U\| \le \|F - X^*AX\| \le \|F - U^*A_2U\|$$

and

$$\|G - V^*B_1V\| \le \|G - Y^*BY\| \le \|G - V^*B_2V\|$$

for any unitary $X, Y \in M_n$; see [21]. Since $\|H + iK\|^2 = \|H\|^2 + \|K\|^2$ for any Hermitian $H, K \in M_n$,
the results follow. ■
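The two-sided bound of Theorem 2.1 is easy to probe numerically. The following is a minimal sketch, assuming NumPy, randomly generated test matrices, and hypothetical helper names; it evaluates the closed-form extremes from the sorted eigenvalues and compares them with random points on the two orbits.

```python
import numpy as np

def haar_unitary(n, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def theorem_bounds(A, B, C):
    """Squared minimum and maximum of ||X*AX + Y*BY - C||_F over unitary X, Y
    for Hermitian A and skew-Hermitian B, using the formulas of Theorem 2.1."""
    f = np.linalg.eigvalsh((C + C.conj().T) / 2)[::-1]    # descending
    g = np.linalg.eigvalsh((C - C.conj().T) / 2j)[::-1]
    a = np.linalg.eigvalsh(A)[::-1]
    b = np.linalg.eigvalsh(-1j * B)[::-1]
    lo = np.sum((f - a) ** 2 + (g - b) ** 2)              # equal orderings
    hi = np.sum((f - a[::-1]) ** 2 + (g - b[::-1]) ** 2)  # opposite orderings
    return lo, hi

rng = np.random.default_rng(0)
n = 4
H = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
K = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (H + H.conj().T) / 2          # Hermitian
B = (K - K.conj().T) / 2          # skew-Hermitian
C = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

lo, hi = theorem_bounds(A, B, C)
samples = []
for _ in range(200):
    X, Y = haar_unitary(n, rng), haar_unitary(n, rng)
    R = X.conj().T @ A @ X + Y.conj().T @ B @ Y - C
    samples.append(np.linalg.norm(R) ** 2)
```

Every sampled squared distance should fall between `lo` and `hi`; random sampling rarely comes close to either extreme, which is one motivation for the gradient flows of the next subsection.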

2.4 Deriving Gradient Flows on Unitary Similarity Orbits 

To begin with, we focus on the problem of approximating a given matrix C using matrices from
two unitary similarity orbits, i.e., finding

$$\min \{ \|UAU^* + VBV^* - C\| : U, V \in SU(n) \text{ unitary} \}.$$

For simplicity, here we describe the steepest descent method to search for unitary matrices $U_0, V_0$
attaining the optimum. Refined approaches like conjugate gradients, Jacobi-type or Newton-type
methods may be implemented likewise, see for instance [30]. As will be shown below, more than
two unitary similarity orbits can be treated similarly. The basic idea is to improve the current
unitary pair $(U_k, V_k)$ to $(U_{k+1}, V_{k+1})$ so that

$$\|U_{k+1}AU_{k+1}^* + V_{k+1}BV_{k+1}^* - C\| < \|U_kAU_k^* + V_kBV_k^* - C\|$$

until the successive iterations differ only by a small tolerance, or the gradient (vide infra) vanishes.
Further, to avoid pitfalls by local minima whenever the Euclidean distance cannot be made zero,
we use a sufficiently large number of different random starting points $(U_0, V_0)$ for our algorithm.
Needless to say, a positive matching result is constructive, while a negative result may be due
to local minima. It is therefore important to use a sufficiently large set of initial conditions for
confident conclusions in the negative case.

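For a start, consider the least-squares minimization task

$$\min_{U,V \in SU(n)} \|UAU^* + VBV^* - C\|_2^2 , \qquad (2.3)$$

which can be rewritten as

$$\|UAU^* + VBV^* - C\|_2^2$$

$$= \|UAU^* + VBV^*\|_2^2 + \|C\|_2^2 - 2\,\mathrm{Re}\,\mathrm{tr}\,\{C^*(UAU^* + VBV^*)\}$$

$$= \|A\|_2^2 + \|B\|_2^2 + \|C\|_2^2 - 2\,\mathrm{Re}\,\mathrm{tr}\,\{C^*(UAU^* + VBV^*) - UAU^*\,VB^*V^*\}$$

and thus is equivalent to the maximisation task

$$\max_{U,V \in SU(n)} \mathrm{Re}\,\mathrm{tr}\,\{C^*(UAU^* + VBV^*) - UAU^*\,VB^*V^*\} . \qquad (2.4)$$

Therefore we set

$$f(U, V) := \mathrm{tr}\,\{(UAU^* + VBV^*)\,C^* - UAU^*\,VB^*V^*\} \qquad (2.5)$$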
and $F(U, V) := \mathrm{Re}\, f(U, V)$. Then its Fréchet derivative $D_U f(U) : T_U\mathcal{U} \to T_{f(U)}\mathbb{C}$ can be seen as a
tangent map, where the elements of the tangent space $T_U\mathcal{U}$ to the Lie group of unitaries $\mathcal{U} = SU(n)$
or $U(n)$ at the point U take the form $\Omega U$ with $\Omega = -\Omega^*$ being itself an element of the Lie algebra.
The differential thus reads

$$D_U f(U)(\Omega U) = \mathrm{tr}\,\{((\Omega U)AU^* + UA(\Omega U)^*)(C^* - VB^*V^*)\}$$

$$= \mathrm{tr}\,\{((\Omega U)AU^* - UAU^*(\Omega U)U^*)(C^* - VB^*V^*)\}$$

$$= \mathrm{tr}\,\{(AU^*(C^* - VB^*V^*) - U^*(C^* - VB^*V^*)\,UAU^*)(\Omega U)\}$$

where we used the invariance of the trace under cyclic permutations and $(\Omega U)^* = -U^*(\Omega U)U^*$,
which follows from the product rule for $D(\mathbb{1})(\Omega U) = D(UU^*)(\Omega U) = 0 = (\Omega U)U^* + U(\Omega U)^*$ in
consistency with the Lie-algebra elements being skew-Hermitian. Moreover, by identifying

$$D_U f(U) \cdot (\Omega U) = \langle \mathrm{grad}_U f(U) \,|\, \Omega U \rangle = \mathrm{tr}\,\{(\mathrm{grad}_U f(U))^*\,\Omega U\} \qquad (2.6)$$

one finds

$$\mathrm{grad}_U f(U) = (C - VBV^*)UA^* - UA^*U^*(C - VBV^*)U = [(C - VBV^*),\, UA^*U^*]\, U .$$

With $[X^*, Y]_S := \tfrac{1}{2}([X^*, Y] - [X^*, Y]^*) = \tfrac{1}{2}([X^*, Y] + [X, Y^*])$ as skew-Hermitian part of the
commutator one obtains for $F(U) := \mathrm{Re}\, f(U)$

$$\mathrm{grad}_U F(U) = [(C^* - VB^*V^*),\, UAU^*]_S\, U . \qquad (2.7)$$

Taking the respective Riemannian exponentials $\exp_U(\mathrm{grad}_U F(U))$ and $\exp_V(\mathrm{grad}_V F(V))$ thus
gives the recursive gradient flows

$$U_{k+1} = \exp\{-\alpha_k\, [U_kAU_k^*,\, (C^* - V_kB^*V_k^*)]_S\}\; U_k$$
$$V_{k+1} = \exp\{-\beta_k\, [V_kBV_k^*,\, (C^* - U_kA^*U_k^*)]_S\}\; V_k$$

as discretized solutions of the coupled gradient system

$$\dot{U} = \mathrm{grad}_U F(U, V) \quad \text{and} \quad \dot{V} = \mathrm{grad}_V F(U, V) . \qquad (2.8)$$

Conditions for convergence are described in detail in [15]. For appropriate step sizes $\alpha_k, \beta_k$ see also
Ref. HU. 
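The recursion for $U_{k+1}$ and $V_{k+1}$ can be sketched in a few lines. What follows is an illustration under stated assumptions, not the authors' implementation: the step sizes are chosen by simple step halving (an assumption replacing the step-size rules of the cited references), the matrix exponential of a skew-Hermitian matrix is computed by an eigendecomposition, and the names `skew`, `exp_skew`, `dist2` and `flow` are hypothetical.

```python
import numpy as np

def skew(M):
    """Skew-Hermitian part of M."""
    return (M - M.conj().T) / 2

def exp_skew(Omega):
    """exp(Omega) for skew-Hermitian Omega, via the Hermitian matrix i*Omega."""
    w, Q = np.linalg.eigh(1j * Omega)
    return (Q * np.exp(-1j * w)) @ Q.conj().T

def dist2(U, V, A, B, C):
    R = U @ A @ U.conj().T + V @ B @ V.conj().T - C
    return np.linalg.norm(R) ** 2

def flow(A, B, C, steps=200, alpha=1.0):
    """Discretized coupled flow; each step is halved until the distance decreases."""
    n = A.shape[0]
    U = np.eye(n, dtype=complex)
    V = np.eye(n, dtype=complex)
    history = [dist2(U, V, A, B, C)]
    for _ in range(steps):
        Xa = U @ A @ U.conj().T
        Xb = V @ B @ V.conj().T
        Da = (C - Xb).conj().T            # C* - V B* V*
        Db = (C - Xa).conj().T            # C* - U A* U*
        Gu = skew(Xa @ Da - Da @ Xa)      # [U A U*, C* - V B* V*]_S
        Gv = skew(Xb @ Db - Db @ Xb)      # [V B V*, C* - U A* U*]_S
        step = alpha
        while step > 1e-12:
            Un = exp_skew(-step * Gu) @ U
            Vn = exp_skew(-step * Gv) @ V
            d = dist2(Un, Vn, A, B, C)
            if d < history[-1]:
                U, V = Un, Vn
                history.append(d)
                break
            step /= 2
        else:
            break                         # no decreasing step found: stop
    return U, V, history

# Illustrative data: C is built so that an exact match exists.
rng = np.random.default_rng(1)
n = 3
A = np.diag([1.0, 2.0, 4.0]).astype(complex)
B = np.diag([0.5, 1.5, 3.0]).astype(complex)
U0, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
V0, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
C = U0 @ A @ U0.conj().T + V0 @ B @ V0.conj().T
U, V, history = flow(A, B, C)
```

The step halving only guarantees a monotonically decreasing distance; whether the global minimum is reached depends on the starting point, which is why several random initial conditions are recommended in the text.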

Generalizing the findings from a sum of two orbits to higher sums of unitary orbits is
straightforward: the problem

$$\min \Big\{ \Big\| \sum_{j=1}^{N} U_j A_j U_j^* - A_0 \Big\| : U_1, \ldots, U_N \in SU(n) \text{ unitary} \Big\}$$

can be addressed by the system of coupled gradient flows ($j = 1, 2, \ldots, N$)

$$U_{k+1}^{(j)} = \exp\{-\alpha_k^{(j)}\, [A_k^{(j)},\, A_{0jk}^*]_S\}\; U_k^{(j)} \qquad (2.10)$$

where for short we set $A_k^{(j)} := U_k^{(j)} A_j U_k^{(j)*}$ and $A_{0jk} := A_0 - \sum_{\nu = 1,\, \nu \ne j}^{N} A_k^{(\nu)}$.

These gradient flows follow the extension of the original idea on the orthogonal group [3l [15] to 
the unitary group [T3], where here we introduce a larger system of coupled flows. 
Figure 1: Coupled flows minimizing $\|\sum_{j=1}^{N} U_jA_jU_j^* - A_0^{(N)}\|_2^2$ with (a) $N = 2$ and (b) $N = 10$ for
Example 1.



2.5 Numerical Examples 

Here we demonstrate gradient flows minimising $\|\sum_{j=1}^{N} U_jA_jU_j^* - A_0\|$ over the unitaries $U_1, \ldots, U_N$
for given Hermitian matrices $A_0, \ldots, A_N$.

Example 1

As a test case, consider the following examples for finding $U_j \in \mathbb{C}^{10 \times 10}$. For $j = 1, 2, \ldots, N$ choose
a set of random unitaries $U_j^{(0)} \in \mathbb{C}^{10 \times 10}$ distributed according to the Haar measure as recently
described in [27] and define $A_j := \mathrm{diag}\,(1, 3, 5, \ldots, 19) + j\,\mathbb{1}_{10}$ and $A_0^{(N)} := \mathrm{diag}\,(a_1, \ldots, a_{10})$, where
$a_1, a_2, \ldots, a_{10}$ are the eigenvalues of $A_0'^{(N)} := \sum_{j=1}^{N} U_j^{(0)} A_j U_j^{(0)*}$ (and $\mathbb{1}_{10}$ is the $10 \times 10$ identity matrix).
As shown in Fig. 1, the gradient flow of Eqn. (2.10) minimizes $\|\sum_{j=1}^{N} U_jA_jU_j^* - A_0^{(N)}\|_2$ by driving
it practically to zero. Note that in Fig. 1b the combined flow on $N = 10$ unitaries converges even
faster than in Fig. 1a, where $N = 2$ and the flow is more sensitive to saddle points as may be
inferred from the jumps in trace (a).

Example 2 

Let A, B be Hermitian and C arbitrary, e.g.,

$$A = \begin{pmatrix} 2 & 5 & 11 \\ 5 & 8 & 15 \\ 11 & 15 & 16 \end{pmatrix}, \quad
B = \begin{pmatrix} 6 & 8 & 9 \\ 8 & 12 & 10 \\ 9 & 10 & 0 \end{pmatrix}, \quad
C = \begin{pmatrix} 1 & 11 & 3 \\ 6 & 9 & 3 \\ 8 & 9 & 2 \end{pmatrix}.$$

Then $a := \mathrm{eig}(A) = (-5.6674;\ -0.4830;\ 32.1504)$, $b := \mathrm{eig}(B) = (-7.4816;\ 0.7123;\ 24.7693)$ and
$f := \mathrm{eig}\,\tfrac{1}{2}(C + C^*) = (-4.9555;\ -1.3888;\ 18.3443)$, $g := \mathrm{eig}\,\tfrac{1}{2i}(C - C^*) = (-4.6368;\ 0;\ 4.6368)$.
According to Theorem 2.1 one gets

$$\Delta := \min_{U,V} \|UAU^* + iVBV^* - C\|_2^2 = (a - f)^*(a - f) + (b - g)^*(b - g) = 605.8521 . \qquad (2.11)$$

More precisely $\Delta = 605.8521310913004$, while 100 runs of the gradient flow with independent
random initial conditions give a mean ± rmsd of $\Delta = 605.8521310913570$ ± 1.13 • 10^^'^.
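The value in (2.11) can be recomputed from the eigenvalues alone. The sketch below assumes NumPy and takes the matrices of Example 2 with integer entries; their first rows are an assumption, reconstructed so that the spectra agree with the eigenvalues quoted in the text.

```python
import numpy as np

# Matrices of Example 2. The first rows are an assumption: they are
# reconstructed from the eigenvalues quoted in the text.
A = np.array([[2, 5, 11], [5, 8, 15], [11, 15, 16]], dtype=float)
B = np.array([[6, 8, 9], [8, 12, 10], [9, 10, 0]], dtype=float)
C = np.array([[1, 11, 3], [6, 9, 3], [8, 9, 2]], dtype=float)

a = np.linalg.eigvalsh(A)                # ascending order
b = np.linalg.eigvalsh(B)
f = np.linalg.eigvalsh((C + C.T) / 2)    # Hermitian part of C
g = np.linalg.eigvalsh((C - C.T) / 2j)   # skew-Hermitian part of C divided by i

# Equal orderings of both spectra give the minimum of Theorem 2.1.
delta = np.sum((a - f) ** 2) + np.sum((b - g) ** 2)
print(np.round(a, 4), np.round(b, 4), round(float(delta), 4))
```

Pairing the spectra in the same order gives the minimum; reversing one of each pair would give the maximum of Theorem 2.1 instead.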



3 Unitary Equivalence 

In this section, we study

$$\min \Big\{ \Big\| \sum_{j=1}^{N} U_j A_j V_j - A_0 \Big\| : U_1, \ldots, U_N \in U(n) \text{ and } V_1, \ldots, V_N \in U(m) \text{ unitary} \Big\}$$

for rectangular matrices $A_0, \ldots, A_N$. By the result of O'Shea and Sjamaar,

$$\min \Big\| \sum_{j=1}^{N} U_j A_j V_j - A_0 \Big\| = 0$$

if and only if

$$\min \Big\| \sum_{j=1}^{N} W_j^* \tilde{A}_j W_j - \tilde{A}_0 \Big\| = 0,$$

where

$$\tilde{A}_j = \begin{pmatrix} 0 & A_j \\ A_j^* & 0 \end{pmatrix} \quad \text{for } j = 0, 1, \ldots, N.$$

Thus, by the results concerning unitary similarity orbits (see Section 2),

$$\min \Big\{ \Big\| A_0 - \sum_{j=1}^{N} U_j A_j V_j \Big\| : U_1, \ldots, U_N;\ V_1, \ldots, V_N \text{ unitary} \Big\} = 0 \qquad (3.12)$$

if and only if the singular values of $A_0, A_1, \ldots, A_N$ satisfy a certain set of linear inequalities. Clearly,
$\min\{\|A - UBV\| : U, V \text{ unitary}\} = 0$ if and only if A and B have the same singular values. In
general, it is interesting to check whether

$$\sqrt{2}\, \min \Big\| \sum_{j=1}^{N} U_j A_j V_j - A_0 \Big\| = \min \Big\| \sum_{j=1}^{N} W_j^* \tilde{A}_j W_j - \tilde{A}_0 \Big\| = 0.$$

In computer experiments (see Example 6 in Section 3), we observe that (3.12) always holds if
$A_0, A_1, \ldots, A_N$ are random matrices generated by MATLAB. We explain this
phenomenon in the following. We begin with a simple observation.

Lemma 3.1 Suppose $a_0, a_1, \ldots, a_N \in (0, \infty)$. The following are equivalent.

(a) There are complex units $e^{it_1}, \ldots, e^{it_N}$ such that $a_0 - \sum_{j=1}^{N} a_j e^{it_j} = 0$.

(b) There is an $(N + 1)$-sided convex polygon whose sides have lengths $a_0, \ldots, a_N$.

(c) $\sum_{j=0}^{N} a_j - 2a_k \ge 0$ for all $k = 0, 1, \ldots, N$.
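Condition (c) of Lemma 3.1 is a one-line test, and for $N = 2$ the phases in (a) follow from the law of cosines. The following minimal sketch uses only the standard library; the function names are hypothetical, and `closing_phases` assumes positive lengths that already satisfy (c).

```python
import cmath
import math

def polygon_closes(lengths):
    """Condition (c): no side is longer than the sum of all the others."""
    total = sum(lengths)
    return all(total - 2 * a >= 0 for a in lengths)

def closing_phases(a0, a1, a2):
    """For N = 2, return (t1, t2) with a0 = a1*exp(i*t1) + a2*exp(i*t2).

    Uses the law of cosines for the triangle with sides a0, a1, a2."""
    cos1 = (a0 ** 2 + a1 ** 2 - a2 ** 2) / (2 * a0 * a1)
    cos2 = (a0 ** 2 + a2 ** 2 - a1 ** 2) / (2 * a0 * a2)
    t1 = math.acos(max(-1.0, min(1.0, cos1)))
    t2 = -math.acos(max(-1.0, min(1.0, cos2)))
    return t1, t2

t1, t2 = closing_phases(5.0, 3.0, 4.0)
residual = abs(3.0 * cmath.exp(1j * t1) + 4.0 * cmath.exp(1j * t2) - 5.0)
```

For the side lengths 5, 3, 4 the residual vanishes up to rounding, while lengths such as 10, 1, 2 violate (c) and admit no closing phases.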
From this observation, one easily gets the following condition related to the equality (3.12).

Proposition 3.2 Let $A_j = \mathrm{diag}\,(a_{1j}, \ldots, a_{nj})$ be nonnegative diagonal matrices for $j = 0, 1, \ldots, N$,
and let $v_j = (a_{1j}, \ldots, a_{nj})^t$. Then there exist permutation matrices $P_1, \ldots, P_N$ and diagonal unitary
matrices $D_1, \ldots, D_N$ such that

$$A_0 = \sum_{j=1}^{N} D_j P_j A_j P_j^t$$

if and only if the entries of each row of the matrix

$$[\, v_0 \,|\, P_1 v_1 \,|\, \cdots \,|\, P_N v_N \,]$$

correspond to the sides of an $(N + 1)$-sided convex polygon.

If one examines the singular values of an $n \times n$ random matrix generated by MATLAB, one sees that
there is always a dominant singular value of size about $n/2$, and the other singular values range
from 0 to 1.5n in a rather systematic pattern. So, it is often possible to apply Proposition 3.2 to
get equality (3.12) if $A_0, \ldots, A_N$ are random matrices generated by MATLAB for $N \ge 2$.

In contrast, for general matrices, it is easy to construct $A_0, A_1, \ldots, A_N$ such that (3.12) fails.
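Example 3

Let $A_0 = \mathrm{diag}\,(N^2, N + 1) \oplus 0_{n-2}$ and $A_j = \mathrm{diag}\,(N, 1) \oplus 0_{n-2}$ for $j = 1, \ldots, N$. Then clearly
Eqn. (3.12) does not apply, because the singular values $s_j(\cdot)$ satisfy

$$\sum_{j=1}^{n} s_j(A_0) > \sum_{i=1}^{N} \sum_{j=1}^{n} s_j(A_i).$$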

Recall that the Ky Fan k-norm of a matrix $A \in M_n$ is defined as $\|A\|_k = \sum_{j=1}^{k} s_j(A)$, and that a
norm $\|\cdot\|$ on $M_n$ is unitarily invariant if $\|A\| = \|UAV\|$ for all $A \in M_n$ and unitary $U, V \in M_n$.
By the Ky Fan dominance theorem, two matrices $A, B \in M_n$ satisfy $\|A\|_k \le \|B\|_k$ for $k = 1, \ldots, n$
if and only if $\|A\| \le \|B\|$ for all unitarily invariant norms $\|\cdot\|$. In view of this example, we have
the following result.

Proposition 3.3 Suppose $A_0, A_1, \ldots, A_N \in M_n$ satisfy (3.12). Then for all unitarily invariant
norms,

$$2\|A_i\| \le \sum_{j=0}^{N} \|A_j\|, \qquad i = 0, 1, \ldots, N,$$

and equivalently, for $k = 1, \ldots, n$,

$$2\|A_i\|_k \le \sum_{j=0}^{N} \|A_j\|_k, \qquad i = 0, 1, \ldots, N. \qquad (3.13)$$

Moreover, if there is k such that equality holds, then (3.12) holds if and only if $A_j$ is unitarily
similar to $B_j \oplus C_j$ with $B_j \in M_k$ for $j = 0, \ldots, N$ such that

$$\min \Big\{ \Big\| B_0 - \sum_{j=1}^{N} U_j B_j V_j \Big\| : U_1, \ldots, U_N, V_1, \ldots, V_N \in M_k \text{ are unitary} \Big\} = 0$$

and

$$\min \Big\{ \Big\| C_0 - \sum_{j=1}^{N} X_j C_j Y_j \Big\| : X_1, \ldots, X_N, Y_1, \ldots, Y_N \in M_{n-k} \text{ are unitary} \Big\} = 0.$$
It would be nice if one could get (3.12) by checking the relatively easy condition (3.13).
Unfortunately, the following example shows that this is not true.

Example 4

Let $A_0 = \mathrm{diag}\,(14, 2)$, $A_1 = \mathrm{diag}\,(8, 0)$, $A_2 = \mathrm{diag}\,(7, 4)$. Then (3.13) is satisfied for all $k \ge 1$ but
by the result in [23],

$$\mathrm{diag}\,(U_1A_1V_1 + U_2A_2V_2) \ne (14, 2)$$

for all unitaries $U_j, V_j$.
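Condition (3.13) only involves singular values, so it can be checked directly. The sketch below assumes NumPy; the function names are hypothetical, and the instance of Example 3 uses the illustrative choice $N = 3$, $n = 2$.

```python
import numpy as np

def ky_fan(A, k):
    """Ky Fan k-norm: sum of the k largest singular values."""
    s = np.linalg.svd(np.atleast_2d(A), compute_uv=False)   # descending order
    return float(np.sum(s[:k]))

def satisfies_3_13(mats):
    """Check 2*||A_i||_k <= sum_j ||A_j||_k for every i and every k."""
    n = min(mats[0].shape)
    for k in range(1, n + 1):
        norms = [ky_fan(A, k) for A in mats]
        if any(2 * x > sum(norms) + 1e-12 for x in norms):
            return False
    return True

# Example 4: the necessary condition (3.13) holds although (3.12) fails.
example4 = [np.diag([14.0, 2.0]), np.diag([8.0, 0.0]), np.diag([7.0, 4.0])]
# Example 3 with the illustrative choice N = 3, n = 2: (3.13) already fails for k = 2.
example3 = [np.diag([9.0, 4.0])] + [np.diag([3.0, 1.0])] * 3

print(satisfies_3_13(example4), satisfies_3_13(example3))
```

A `True` result is therefore only a necessary screening step; deciding (3.12) itself still requires the flows of Section 3.1 or the finer results cited in the text.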

3.1 Deriving Gradient Flows on Unitary Equivalence Orbits 

For minimizing $\|UAV - C\|_2^2$ one has to maximize

$$F(U, V) := \mathrm{Re}\,\mathrm{tr}\,\{UAVC^*\} = \tfrac{1}{2}\,\mathrm{tr}\,\{UAVC^* + (UAVC^*)^*\} .$$

By the same arguments as before, from its Fréchet differential

$$D_U F(U, V)(\Omega U) = \tfrac{1}{2}\,\mathrm{tr}\,\{(\Omega U)AVC^* - CV^*A^*U^*(\Omega U)U^*\} = \tfrac{1}{2}\,\mathrm{tr}\,\{(AVC^* - U^*CV^*A^*U^*)(\Omega U)\}$$

one obtains the gradient, where henceforth we keep writing $(\cdot)_S$ for the skew-Hermitian part,

$$\mathrm{grad}_U F(U, V) = \tfrac{1}{2}(AVC^* - U^*CV^*A^*U^*)^* = -(UAVC^*)_S\, U .$$

An analogous result follows for $\mathrm{grad}_V F(U, V)$. Taking again the respective Riemannian
exponentials leads to the recursive scheme

$$U_{k+1} = \exp\{-\alpha_k\, (U_kAV_kC^*)_S\}\; U_k$$

$$V_{k+1} = \exp\{-\beta_k\, (V_kC^*U_kA)_S\}\; V_k ,$$

which also can be used, e.g., for a singular-value decomposition of A by choosing C real diagonal.
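The recursive scheme for a single equivalence orbit can be sketched as follows. This is an illustration under assumptions: step sizes are found by step halving rather than by the rules of the cited references, the exponential of a skew-Hermitian matrix is computed by an eigendecomposition, and all function names are hypothetical.

```python
import numpy as np

def skew(M):
    """Skew-Hermitian part of M."""
    return (M - M.conj().T) / 2

def exp_skew(Omega):
    """exp(Omega) for skew-Hermitian Omega, via the Hermitian matrix i*Omega."""
    w, Q = np.linalg.eigh(1j * Omega)
    return (Q * np.exp(-1j * w)) @ Q.conj().T

def equivalence_flow(A, C, steps=300, alpha=1.0):
    """Decrease ||U A V - C||_F^2 over unitary U (n x n) and V (m x m)."""
    n, m = A.shape
    U = np.eye(n, dtype=complex)
    V = np.eye(m, dtype=complex)
    history = [np.linalg.norm(U @ A @ V - C) ** 2]
    for _ in range(steps):
        Gu = skew(U @ A @ V @ C.conj().T)    # (U A V C*)_S
        Gv = skew(V @ C.conj().T @ U @ A)    # (V C* U A)_S
        step = alpha
        while step > 1e-12:
            Un = exp_skew(-step * Gu) @ U
            Vn = exp_skew(-step * Gv) @ V
            d = np.linalg.norm(Un @ A @ Vn - C) ** 2
            if d < history[-1]:
                U, V = Un, Vn
                history.append(d)
                break
            step /= 2
        else:
            break                            # no decreasing step found: stop
    return U, V, history

# Illustrative rectangular data: C lies on the equivalence orbit of A.
rng = np.random.default_rng(2)
A = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
U0, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
V0, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
C = U0 @ A @ V0
U, V, history = equivalence_flow(A, C)
```

Choosing C as a real diagonal (rectangular) matrix carrying the singular values of A turns the same routine into the flow-based singular-value decomposition mentioned above.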

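Likewise, minimizing $\|UAV + XBY - C\|_2^2$ by maximizing $\mathrm{Re}\,\mathrm{tr}\,\{UAV(C - XBY)^* + XBYC^*\}$
translates into the same flows when substituting $C \to (C - X_kBY_k)$, with analogous recursions for
$X_{k+1}$ and $Y_{k+1}$. Along these lines, it is straightforward to address the general task

$$\min \Big\{ \Big\| \sum_{j=1}^{N} U_j A_j V_j - A_0 \Big\| : U_1, \ldots, U_N \in U(n) \text{ and } V_1, \ldots, V_N \in U(m) \text{ unitary} \Big\} \qquad (3.14)$$

with rectangular matrices $A_0, \ldots, A_N$ by a system of $2N$ coupled gradient flows ($j = 1, 2, \ldots, N$)

$$U_{k+1}^{(j)} = \exp\{-\alpha_k^{(j)}\, (U_k^{(j)} A_j V_k^{(j)} A_{0jk}^*)_S\}\; U_k^{(j)} \qquad (3.15)$$
$$V_{k+1}^{(j)} = \exp\{-\beta_k^{(j)}\, (V_k^{(j)} A_{0jk}^* U_k^{(j)} A_j)_S\}\; V_k^{(j)} \qquad (3.16)$$

where we use the short-hand $A_{0jk} := A_0 - \sum_{\nu = 1,\, \nu \ne j}^{N} U_k^{(\nu)} A_\nu V_k^{(\nu)}$.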
Figure 2: Coupled flows minimizing $\|\sum_{j=1}^{N} U_jA_jV_j - A_0^{(N)}\|_2^2$ with (a) $N = 2$ and (b) $N = 10$.
Here the $A_j \in \mathbb{C}^{10 \times 15}$ are rectangular so that $U_j \in \mathbb{C}^{10 \times 10}$, $V_j \in \mathbb{C}^{15 \times 15}$.

3.2 Numerical Examples 

Using the flows derived in Section 3.1, in this section we study

$$\min \Big\{ \Big\| \sum_{j=1}^{N} U_j A_j V_j - A_0 \Big\| : U_1, \ldots, U_N \in U(n) \text{ and } V_1, \ldots, V_N \in U(m) \text{ unitary} \Big\}$$

for rectangular matrices $A_0, \ldots, A_N$.

Example 5

As an example of rectangular $A_j \in \mathbb{C}^{10 \times 15}$, consider the analogous flows. In order to obtain
$U_j \in \mathbb{C}^{10 \times 10}$ and $V_j \in \mathbb{C}^{15 \times 15}$ for $j = 1, 2, \ldots, N$ choose a set of random unitary pairs
$(U_j^{(0)}, V_j^{(0)}) \in \mathbb{C}^{10 \times 10} \times \mathbb{C}^{15 \times 15}$ and define

$$A_j := [\, \mathrm{diag}\,(1, 3, 5, \ldots, 19) + j\,\mathbb{1}_{10} \;|\; 0_{10,5} \,] \quad \text{and} \quad A_0^{(N)} := [\, \mathrm{diag}\,(s_1, \ldots, s_{10}) \;|\; 0_{10,5} \,]$$

where $s_1, s_2, \ldots, s_{10}$ are now the singular values of $A_0'^{(N)} := \sum_{j=1}^{N} U_j^{(0)} A_j V_j^{(0)}$ and $0_{10,5}$ is the $10 \times 5$
zero-matrix. Fig. 2 shows how the coupled gradient flow minimizes $\|\sum_{j=1}^{N} U_jA_jV_j - A_0^{(N)}\|_2^2$,
driving it practically to zero. Again the combined flow on $N = 10$ unitary pairs (Fig. 2b) converges
faster than the one for $N = 2$ unitary pairs given in Fig. 2a.

3.2.1 Observation Concerning Sums of Unitary Equivalence Orbits 

A non-zero random complex matrix $A_0$ is typically distant from a single equivalence orbit of another
(non-zero) random matrix $UA_1V$ of the same dimension, since generically $A_0$ and $A_1$ clearly do
not share the same singular values. However, a random complex matrix $A_0$ is in fact typically
arbitrarily close to a sum of two or more equivalence orbits of independent random matrices. This
is shown in Fig. 3 by a numerical example for $10 \times 10$ complex square matrices, where the inset
shows this does not hold for similarity orbits of random square matrices. Interestingly, the findings
hold independent of the dimensions and explicitly include rectangular matrices as well as square
matrices.

Figure 3: A random complex square matrix $A_0 \in \mathbb{C}^{10 \times 10}$ is typically distant from a single ($N = 1$)
equivalence orbit of another random square matrix $UA_1V$, as shown in the upper trace. However,
it is typically arbitrarily close to a sum of equivalence orbits of several independent random square
matrices as demonstrated in the lower traces: $\|\sum_{j=1}^{N} U_jA_jV_j - A_0\|_2^2 \to 0$ for $N = 2, 3, 4, 5, 10$.
In contrast, the inset shows this does not hold for $N = 1$ through $N = 10$ for similarity orbits
$\sum_{j=1}^{N} U_jA_jU_j^*$.

Example 6 

For a single random complex square matrix $A_0 \in \mathbb{C}^{10 \times 10}$, we examine how close it typically is to the
sum of $N = 1, 2, 3, 4, 5, 10$ equivalence orbits $\sum_{j=1}^{N} U_jA_jV_j$, where the $A_j$ are independently chosen
random complex matrices $A_j \in \mathbb{C}^{10 \times 10}$, and compare the findings with those of N independent
similarity orbits $\sum_{j=1}^{N} U_jA_jU_j^*$. The results of Fig. 3 underscore Proposition 3.2.

4 Unitary t-Congruence

In this section, we consider

$$\min \Big\{ \Big\| \sum_{j=1}^{N} U_j A_j U_j^t - A_0 \Big\| : U_1, \ldots, U_N \in U(n) \text{ unitary} \Big\}$$

for given matrices $A_0, A_1, \ldots, A_N$. Sometimes, we can focus on special classes of matrices such
as symmetric matrices or skew-symmetric matrices. For symmetric matrices or skew-symmetric
matrices, the minimization problem

$$\min \{ \|UAU^t - A_0\| : U \text{ unitary} \}$$

has an analytic solution; see [2^. The problem is wide open even if $N = 2$. Therefore, a computer
algorithm will be most helpful in the theoretical development. One may also consider whether we
can have $UAU^t + VBV^t = C$ for a symmetric A and a skew-symmetric B. In other words, we
want to know whether one can write C as the sum of symmetric and skew-symmetric matrices
with prescribed singular values. Of course, the problem for general matrices A, B and C is even
more challenging, and that is what we pursue by the numerical methods developed in the next
paragraph.

4.1 Gradient Flows on Unitary t-Congruence Orbits 

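Again, the minimization task

$$\min_{U,V} \|UAU^t + VBV^t - C\|_2^2 , \qquad (4.17)$$

translates via

$$\|UAU^t + VBV^t - C\|_2^2 = \|A\|_2^2 + \|B\|_2^2 + \|C\|_2^2 - 2\,\mathrm{Re}\,\mathrm{tr}\,\{C^*(UAU^t + VBV^t) - UAU^t\, \bar{V}B^*V^*\}$$

into maximising the function

$$F(U, V) := \mathrm{Re}\, f(U, V) := \mathrm{Re}\,\mathrm{tr}\,\{(UAU^t + VBV^t)\,C^* - UAU^t\, \bar{V}B^*V^*\} , \qquad (4.18)$$

where the differential reads (by virtue of the short-hand $\tilde{C} := C^* - \bar{V}B^*V^*$)

$$D_U f(U)(\Omega U) = \mathrm{tr}\,\{((\Omega U)AU^t + UA(\Omega U)^t)(C^* - \bar{V}B^*V^*)\}$$

$$= \mathrm{tr}\,\{(\Omega U)AU^t\tilde{C}\} + \mathrm{tr}\,\{(UA(\Omega U)^t\tilde{C})^t\}$$

$$= \mathrm{tr}\,\{(AU^t\tilde{C} + A^tU^t\tilde{C}^t)(\Omega U)\} .$$

From identifying $D_U f(U) \cdot (\Omega U) = \langle \mathrm{grad}_U f(U) \,|\, \Omega U \rangle = \mathrm{tr}\,\{(\mathrm{grad}_U f(U))^*\,\Omega U\}$ one finds

$$\mathrm{grad}_U f(U) = (UAU^t\tilde{C} + UA^tU^t\tilde{C}^t)^*\, U \qquad (4.19)$$

so as to obtain for $F(U) := \mathrm{Re}\, f(U)$

$$\mathrm{grad}_U F(U) = -(UAU^t\tilde{C} + UA^tU^t\tilde{C}^t)_S\, U . \qquad (4.20)$$

Again, taking the respective Riemannian exponentials $\exp_U(\mathrm{grad}_U F(U))$ and $\exp_V(\mathrm{grad}_V F(V))$
thus gives the slightly lengthy formula

$$U_{k+1} = \exp\{-\alpha_k \big( U_kAU_k^t\,(C^* - \bar{V}_kB^*V_k^*) + U_kA^tU_k^t\,(C^* - \bar{V}_kB^*V_k^*)^t \big)_S\}\; U_k \qquad (4.21)$$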
(and an analogous equation for $V_{k+1}$ by substituting V for U and B for A) as discretized solutions
of the coupled gradient system

$$\dot{U} = \mathrm{grad}_U F(U, V) \quad \text{and} \quad \dot{V} = \mathrm{grad}_V F(U, V) . \qquad (4.22)$$

Likewise, for higher sums of congruence orbits one finds

$$\min \Big\{ \Big\| \sum_{j=1}^{N} U_j A_j U_j^t - A_0 \Big\| : U_1, \ldots, U_N \in U(n) \text{ unitary} \Big\} \qquad (4.23)$$

to be solved by the coupled system of flows ($j = 1, 2, \ldots, N$)

$$U_{k+1}^{(j)} = \exp\{-\alpha_k^{(j)} \big( A_k^{(j)} A_{0jk}^* + (A_{0jk}^* A_k^{(j)})^t \big)_S\}\; U_k^{(j)} , \qquad (4.24)$$

where for short we set $A_k^{(j)} := U_k^{(j)} A_j U_k^{(j)\,t}$ and $A_{0jk} := A_0 - \sum_{\nu = 1,\, \nu \ne j}^{N} A_k^{(\nu)}$.

5 Outlook: Non-Compact Groups 

For orbits $S(A)$ of matrices A under the action of non-compact groups, there are usually no good
results for the supremum or infimum of the quantity

$$\Big\| X_0 - \sum_{j=1}^{N} X_j \Big\|$$

with $X_j \in S(A_j)$ for $j = 0, 1, \ldots, N$, for given matrices $A_0, \ldots, A_N$.
For example, for the invertible congruence orbit of $A \in M_n$

$$S(A) = \{ S^*AS : S \in M_n \text{ is invertible} \} ,$$

we can let $S = rI$. Then

$$\Big\| S^*A_0S - \sum_{j=1}^{N} S^*A_jS \Big\|$$

converges to 0 or $\infty$ depending on $r \to 0$ or $r \to \infty$.

Similarly, the same problems occur for the equivalence orbit of $A \in M_n$

$$S(A) = \{ SAT : S, T \in M_n \text{ are invertible} \} .$$

For the similarity orbits, we have the following.

Proposition 5.1 Suppose not all the matrices $A_0, \ldots, A_N$ are scalar. Then

$$\sup \Big\| A_0 - \sum_{j=1}^{N} S_j^{-1} A_j S_j \Big\| = \infty .$$
Proof. Suppose one of the matrices, say, $A_i$ is non-scalar. Then there is $S_i$ such that $S_i^{-1}A_iS_i$
is in lower triangular form with the (2, 1) entry equal to 1, and there are invertible matrices $S_j$
such that $S_j^{-1}A_jS_j$ is in upper triangular form for other j. Let $D_r = \mathrm{diag}\,(r, 1, 1, \ldots, 1)$. Then the
sequence

$$(S_0D_r)^{-1}A_0(S_0D_r) - \sum_{j=1}^{N} (S_jD_r)^{-1}A_j(S_jD_r)$$

has unbounded (2, 1) entry as $r \to \infty$. The conclusion follows. ■

Determining

$$\inf \Big\| A_0 - \sum_{j=1}^{N} S_j^{-1} A_j S_j \Big\|$$

is more challenging. Let us first consider two matrices $A, B \in M_n$. We have the following.
Proposition 5.2 Let $A, B \in M_n$. Then for any unitary similarity invariant norm $\|\cdot\|$,

$$\|(\mathrm{tr}\,A - \mathrm{tr}\,B)\,I/n\| \le \|S^{-1}AS - T^{-1}BT\|$$

for any invertible S and T.

Proof. Given two real vectors $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$, we say that x is weakly
majorized by y, denoted by $x \prec_w y$, if the sum of the k largest entries of x is not larger than
that of y for $k = 1, \ldots, n$. By the Ky Fan dominance theorem, if $X = \mathrm{diag}\,(x_1, \ldots, x_n)$ and
$Y = \mathrm{diag}\,(y_1, \ldots, y_n)$ are nonnegative matrices such that $(x_1, \ldots, x_n) \prec_w (y_1, \ldots, y_n)$, then $\|X\| \le \|Y\|$
for any unitarily invariant norm $\|\cdot\|$.

Now, suppose $S^{-1}AS - T^{-1}BT$ has diagonal entries $d_1, \ldots, d_n$ and singular values $s_1, \ldots, s_n$.
Then

$$|\mathrm{tr}\,A - \mathrm{tr}\,B| = \Big| \sum_{j=1}^{n} d_j \Big| \le \sum_{j=1}^{n} |d_j| .$$

Thus,

$$|\mathrm{tr}\,A - \mathrm{tr}\,B|\,(1, \ldots, 1)/n \prec_w (|d_1|, \ldots, |d_n|) .$$

It follows that

$$\|(\mathrm{tr}\,A - \mathrm{tr}\,B)\,I/n\| \le \|\mathrm{diag}\,(|d_1|, \ldots, |d_n|)\| \le \|\mathrm{diag}\,(s_1, \ldots, s_n)\| = \|S^{-1}AS - T^{-1}BT\| . \qquad ■$$

Can we always find invertible S and T such that

$$\|S^{-1}AS - T^{-1}BT\| = \|(\mathrm{tr}\,A - \mathrm{tr}\,B)\,I/n\| \,?$$

The answer is no, and we have the following.

Proposition 5.3 Let $\|\cdot\|$ be a unitarily invariant norm on $M_n$. Suppose $A \in M_n$ has eigenvalues
$a_1, \ldots, a_n$, and $B = bI$. Then

$$\inf \{ \|S^{-1}AS - B\| : S \in M_n \text{ is invertible} \} = \|\mathrm{diag}\,(a_1 - b, \ldots, a_n - b)\| .$$

Proof. Suppose $S^{-1}AS - B$ has eigenvalues $a_1 - b, \ldots, a_n - b$, and singular values $s_1, \ldots, s_n$.
Then the product of the k largest entries of the vector $(|a_1 - b|, \ldots, |a_n - b|)$ is not larger than
that of $(s_1, \ldots, s_n)$ for $k = 1, \ldots, n$. It follows that

$$(|a_1 - b|, \ldots, |a_n - b|) \prec_w (s_1, \ldots, s_n),$$

and hence

$$\|\mathrm{diag}\,(|a_1 - b|, \ldots, |a_n - b|)\| \le \|\mathrm{diag}\,(s_1, \ldots, s_n)\| = \|S^{-1}AS - B\| .$$

Note that there is S such that $S^{-1}(A - B)S$ is in upper triangular Jordan form with diagonal
entries $a_1 - b, \ldots, a_n - b$. Let $D_r = \mathrm{diag}\,(1, r, \ldots, r^{n-1})$ for $r > 0$. Then
$(SD_r)^{-1}(A - B)(SD_r) \to \mathrm{diag}\,(a_1 - b, \ldots, a_n - b)$ and
$\|(SD_r)^{-1}(A - B)(SD_r)\| \to \|\mathrm{diag}\,(a_1 - b, \ldots, a_n - b)\|$ as $r \to 0$. So,
we get the conclusion about the infimum. ■
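The scaling argument at the end of the proof can be watched numerically. The sketch below assumes NumPy and uses a hypothetical $3 \times 3$ Jordan block as a stand-in for $S^{-1}(A - B)S$; it conjugates by $D_r$ and records the Frobenius norm as r decreases.

```python
import numpy as np

def scaled_conjugate(J, r):
    """Return D_r^{-1} J D_r with D_r = diag(1, r, ..., r^(n-1))."""
    n = J.shape[0]
    d = r ** np.arange(n)
    return (J * d[None, :]) / d[:, None]   # entry (a, b) becomes J[a, b] * d[b] / d[a]

# Hypothetical Jordan block with eigenvalue 2, standing in for S^{-1}(A - B)S.
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])

target = np.linalg.norm(np.diag(np.diag(J)))   # norm of the diagonal part
norms = [np.linalg.norm(scaled_conjugate(J, r)) for r in (1.0, 0.1, 0.01, 0.001)]
```

The off-diagonal entries shrink proportionally to r, so the norms decrease towards `target` without ever reaching it: for a non-diagonalizable matrix the infimum of Proposition 5.3 is not attained.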

From the above result and proof, we see that if A has an eigenvalue a with eigenspace of
dimension p and B has an eigenvalue b with eigenspace of dimension q such that $p + q - n = r > 0$,
then $S^{-1}AS - T^{-1}BT$ has an eigenvalue $a - b$ of multiplicity at least r. The question is whether
we can write $A = aI_r \oplus A_1$ and $B = bI_r \oplus B_1$ and show that

$$\inf \|S_1^{-1}A_1S_1 - T_1^{-1}B_1T_1\| = \|(\mathrm{tr}\,A_1 - \mathrm{tr}\,B_1)\,I_{n-r}/(n - r)\| .$$

It is interesting to note that the following two quantities may be different.

1) $\inf \{ \|S^{-1}AS - T^{-1}BT\| : S, T \text{ are invertible} \}$.

2) $\inf \{ \|S^{-1}AS - B\| : S \text{ is invertible} \}$.

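For example, suppose $A = \operatorname{diag}(2, -1, -1)$ and

$$B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$

Then there are invertible S and T such that

$$S^{-1}AS = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \quad \text{and} \quad T^{-1}BT = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$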

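So, $C = S^{-1}AS - T^{-1}BT$ is a rank two nilpotent. Thus for any $\varepsilon > 0$, there is an invertible
$R_\varepsilon$ such that

$$R_\varepsilon^{-1} C R_\varepsilon = \begin{pmatrix} 0 & \varepsilon & 0 \\ 0 & 0 & \varepsilon \\ 0 & 0 & 0 \end{pmatrix}.$$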

As a result,

$$\|R_\varepsilon^{-1}S^{-1}ASR_\varepsilon - R_\varepsilon^{-1}T^{-1}BTR_\varepsilon\| \to 0 \quad \text{as } \varepsilon \to 0.$$

So, the quantity in (1) equals zero. On the other hand, for every invertible S, we have

$$\|(A - SBS^{-1})(Se_1)\| = \|A(Se_1)\| \ge \|Se_1\|.$$

Therefore, $\inf \|A - SBS^{-1}\| \ge 1$. So, we see that the quantities in (1) and (2) may be different.
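The example can be explored numerically. The sketch below assumes NumPy and one concrete choice of matrices consistent with the constraints stated in the text ($A = \operatorname{diag}(2,-1,-1)$, $Be_1 = 0$, C a rank two nilpotent); the specific matrices `X`, `Y`, the Jordan basis `P` and the helper names are assumptions made for illustration.

```python
import numpy as np

A = np.diag([2.0, -1.0, -1.0])
B = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])  # assumed B with B e_1 = 0

X = np.ones((3, 3)) - np.eye(3)                                     # assumed S^{-1} A S
Y = np.array([[0.0, 1.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])  # assumed T^{-1} B T
C = X - Y                                                           # rank two nilpotent

# Jordan chain of C starting from e_1: columns C^2 v, C v, v give P^{-1} C P = Jordan block
v = np.array([1.0, 0.0, 0.0])
P = np.column_stack([C @ C @ v, C @ v, v])

def conjugated_norm(eps):
    """Spectral norm of R^{-1} C R with R = P diag(1, eps, eps^2)."""
    R = P @ np.diag([1.0, eps, eps**2])
    return np.linalg.norm(np.linalg.solve(R, C @ R), 2)

rng = np.random.default_rng(1)

def one_sided_distance():
    """Spectral norm of A - S B S^{-1} for a random invertible S."""
    S = rng.standard_normal((3, 3))
    return np.linalg.norm(A - S @ B @ np.linalg.inv(S), 2)

small = [conjugated_norm(eps) for eps in (1e-1, 1e-2, 1e-3)]
samples = [one_sided_distance() for _ in range(200)]
```

Under these assumptions `small` shrinks proportionally to eps, while every value in `samples` stays at or above 1, in line with the gap between the two-sided quantity and the bound on $\|A - SBS^{-1}\|$.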
In connection with the above discussion, it is interesting to study the following problems.

1. Determine

$$\inf\{\|S^{-1}AS - TBT^{-1}\| : S, T \text{ are invertible}\}$$

and characterize the matrix pairs (A, B) attaining the infimum if they exist.

2. Determine

$$\inf\{\|S^{-1}AS - B\| : S \text{ is invertible}\}$$

and characterize the matrix pairs (A, B) attaining the infimum if they exist.

6 Conclusions 

We have treated least-squares approximation problems by elements from sums of various matrix
orbits, including orbits under unitary similarity, equivalence and congruence. Special attention has
been paid to sums of unitary similarity orbits of a Hermitian A and a skew-Hermitian B, where
theoretical results have been obtained and shown to be consistent with numerical findings. Further,
new results on unitary equivalence orbits, stimulated by numerical experiments, have been obtained
and related to geometric arguments.

A general framework based on gradient flows on matrix orbits arising from Lie group actions
has been developed to study the proposed problems. The gradient flows devised to this end extend
the existing toolbox (see e.g. [21 E]) by referring to sums of matrix orbits as summarized in Tab. 1.
This general approach can be used to treat many problems in theory and applications. For instance,
flows on such sums of unitary similarity orbits can also be envisaged as flows on unitaries taking a
block-diagonal form, and hence they relate to relative C-numerical ranges, where the group action is
restricted to a compact subgroup $K \subset SU(n)$ of the full unitary group [29]. Finally, first results
on matrix orbits under non-compact group actions invite further research.
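To make the idea of coupled flows on sums of unitary similarity orbits concrete, here is a minimal sketch assuming NumPy. It assumes a discretized update of the form $U_j \leftarrow \exp\{-\alpha\,[A_j', A_{0j}^*]_S\}\,U_j$ with $A_j' = U_jA_jU_j^*$, $A_{0j} = A_0 - \sum_{\nu \ne j} A_\nu'$ and $[\cdot,\cdot]_S$ the skew-Hermitian part of the commutator; the fixed step size, the number of sweeps, the identity starting points and all function names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def skew_part(X):
    """Skew-Hermitian part of X."""
    return 0.5 * (X - X.conj().T)

def exp_skew(Omega):
    """Matrix exponential of a skew-Hermitian Omega via the Hermitian matrix i*Omega."""
    w, V = np.linalg.eigh(1j * Omega)
    return (V * np.exp(-1j * w)) @ V.conj().T

def orbit_sum_flow(A_list, A0, alpha=0.05, sweeps=100):
    """Sweep over j, taking one gradient-type step for each U_j with the others held fixed."""
    n = A0.shape[0]
    U_list = [np.eye(n, dtype=complex) for _ in A_list]

    def distance():
        total = sum(U @ A @ U.conj().T for U, A in zip(U_list, A_list))
        return np.linalg.norm(total - A0, "fro")

    history = [distance()]
    for _ in range(sweeps):
        for j in range(len(A_list)):
            conj = [U @ A @ U.conj().T for U, A in zip(U_list, A_list)]
            A0j = A0 - (sum(conj) - conj[j])           # target seen by the j-th orbit
            comm = conj[j] @ A0j.conj().T - A0j.conj().T @ conj[j]
            U_list[j] = exp_skew(-alpha * skew_part(comm)) @ U_list[j]
        history.append(distance())
    return U_list, history

rng = np.random.default_rng(2)

def unit_norm_matrix(n):
    # Hypothetical helper: random complex matrix scaled to Frobenius norm one
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return Z / np.linalg.norm(Z, "fro")

A_list = [unit_norm_matrix(4), unit_norm_matrix(4)]
A0 = unit_norm_matrix(4)
U_list, history = orbit_sum_flow(A_list, A0)
```

With matrices scaled to unit Frobenius norm and this small step size, the recorded distances in `history` should not increase from sweep to sweep. The sketch finds a critical point only; it gives no guarantee of reaching the global minimum.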

7 Further Research 

To prevent the search in our algorithms from terminating in local extrema, one has to choose
a sufficiently large set of random initial unitaries distributed according to the Haar measure.
Moreover, commutation properties are known to hold at the critical points. It would be nice to
find a more efficient method to choose starting points for the search, and to prove theorems ensuring
that the absolute minimum will be reached from one of these starting points using our algorithms.
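One common way to draw starting points distributed according to the Haar measure is the QR decomposition of a complex Gaussian matrix with a phase correction on the diagonal of R. The sketch below assumes NumPy; the function name `haar_unitary` and the sample sizes are illustrative, and this construction is a standard technique substituted here, not necessarily the sampler used by the authors.

```python
import numpy as np

def haar_unitary(n, rng):
    """Draw an n x n unitary from the Haar measure on U(n) via QR of a Ginibre matrix."""
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    # Multiply column k of Q by the phase of R[k, k]; without this the law of Q is not Haar
    return Q * (d / np.abs(d))

rng = np.random.default_rng(3)
starts = [haar_unitary(4, rng) for _ in range(20)]   # a pool of random starting points
```

Each element of `starts` can serve as an initial unitary for a separate run of the search, and the best result over all runs is kept. Dividing a sample by an n-th root of its determinant gives an element of SU(n) if the special unitary group is required.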

Our discussion focused on orbits of matrices under actions of compact groups. One can also consider
orbits under actions of non-compact groups. Here are some examples for $S, T \in SL(n, \mathbb{C})$:

(e) the general similarity orbit of a square matrix A is the set of matrices of the form $SAS^{-1}$,

(f) the equivalence orbit of a rectangular matrix A is the set of matrices of the form $SAT$,

(g) the *-congruence orbit of a complex square matrix A is the set of matrices of the form $SAS^*$,

(h) the t-congruence orbit of a square matrix A is the set of matrices of the form $SAS^t$.

However, the fact that $GL(n, \mathbb{C})$ and $SL(n, \mathbb{C})$ are only locally compact entails that there is no
finite Haar measure and consequently no bi-invariant metric on the tangent spaces, but only left- or
right-invariant metrics. Hence the Hilbert-Schmidt scalar product $\langle B | A \rangle = \operatorname{tr}(B^*A)$ has to be treated
with care, in particular since we are interested in the complex domain. Moreover, while in compact
Lie groups the exponential map is surjective and geodesic [1], in locally compact Lie groups it is
generically neither surjective nor geodesic. It is for these reasons that devising gradient flows in
locally compact Lie groups is the subject of a follow-up study.
Table 1: Summary of Least-Squares Approximations by Matrix Orbits and Related Gradient Flows



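unitary similarity
    type and objective: $\min_{U_j \in SU(n)} \big\| \sum_{j=1}^{N} U_j A_j U_j^* - A_0 \big\|$
    coupled gradient flows: $U_j^{(k+1)} = \exp\{-\alpha_k^{(j)} [A_j^{(k)}, A_{0jk}^*]_S\}\, U_j^{(k)}$,
        where $A_j^{(k)} := U_j^{(k)} A_j U_j^{(k)*}$ and $A_{0jk} := A_0 - \sum_{\nu=1,\, \nu \ne j}^{N} A_\nu^{(k)}$

unitary equivalence
    type and objective: $\min_{U_j, V_j \in SU(n)} \big\| \sum_{j=1}^{N} U_j A_j V_j - A_0 \big\|$
    coupled gradient flows: $U_j^{(k+1)} = \exp\{\ldots\}\,\ldots$,
        where $A_{0jk} := A_0 - \sum_{\nu=1,\, \nu \ne j}^{N} U_\nu^{(k)} A_\nu V_\nu^{(k)}$

unitary congruence
    type and objective: $\min_{U_j \in SU(n)} \big\| \sum_{j=1}^{N} U_j A_j U_j^t - A_0 \big\|$
    coupled gradient flows: $\ldots$,
        where $A_j^{(k)} := U_j^{(k)} A_j U_j^{(k)t}$ and $A_{0jk} := A_0 - \sum_{\nu=1,\, \nu \ne j}^{N} A_\nu^{(k)}$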